Support Vector Machines, Kernel Logistic Regression and Boosting
Authors
Abstract
The support vector machine (SVM) is known for its excellent performance in binary classification, i.e., when the response y ∈ {−1, 1}, but its appropriate extension to the multi-class case is still an ongoing research issue. Another weakness of the SVM is that it only estimates sign[p(x) − 1/2], while the probability p(x) = P(Y = 1 | X = x), the conditional probability of a point being in class 1 given X = x, is often of interest itself. We propose a new approach to classification, called the import vector machine (IVM), which is built on kernel logistic regression (KLR). We show on some examples that the IVM performs as well as the SVM in binary classification. The IVM can naturally be generalized to the multi-class case, and furthermore it provides an estimate of the underlying class probabilities. Similar to the "support points" of the SVM, the IVM model uses only a fraction of the training data to index kernel basis functions, typically a much smaller fraction than the SVM uses. This can give the IVM a computational advantage over the SVM, especially when the training data set is large. We illustrate these techniques on some examples and make connections with boosting, another popular machine-learning method for classification.
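To make the KLR side of this concrete, below is a minimal sketch of kernel logistic regression fit over a small set of basis points, assuming an RBF kernel. The actual IVM selects its import points greedily; this sketch simply takes a random subset to keep the illustration short, and all names (`rbf_kernel`, `fit_klr`, `predict_proba`) are hypothetical.

```python
# Minimal kernel logistic regression sketch with a small random set of
# "import" points standing in for the IVM's greedy selection. Assumes an
# RBF kernel and labels y in {-1, +1}; hypothetical names throughout.
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise squared distances -> Gaussian kernel matrix.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fit_klr(X, y, n_import=20, lam=1e-2, gamma=1.0, n_iter=200, lr=0.1, seed=0):
    """Fit f(x) = sum_j a_j K(x, s_j) by gradient descent on the
    regularized negative log-likelihood; y must be in {-1, +1}."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=n_import, replace=False)  # stand-in for greedy selection
    S = X[idx]
    K = rbf_kernel(X, S, gamma)                  # n x m basis matrix
    a = np.zeros(n_import)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-y * (K @ a)))   # P(correct label | x)
        grad = -K.T @ (y * (1.0 - p)) / len(X) + lam * a
        a -= lr * grad
    return S, a

def predict_proba(X, S, a, gamma=1.0):
    # Unlike the SVM, KLR yields an estimate of P(Y = 1 | X = x).
    return 1.0 / (1.0 + np.exp(-rbf_kernel(X, S, gamma) @ a))
```

The point of the sketch is the last line: because the model is fit by (penalized) likelihood rather than hinge loss, `predict_proba` returns the class probability estimate that the SVM's sign[p(x) − 1/2] output cannot provide.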
Similar resources
A Gradient-based Forward Greedy Algorithm for Sparse Gaussian Process Regression
In this chapter, we present a gradient-based forward greedy method for sparse approximation of the Bayesian Gaussian process regression (GPR) model. Different from previous work, which is mostly based on various basis vector selection strategies, we propose to construct, rather than select, a new basis vector at each iterative step. This idea was motivated by the well-known gradient boosting approach...
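For contrast with the construction idea described above, here is a rough sketch of the simpler forward greedy *selection* baseline that the abstract says the method departs from: at each step, pick the training point whose kernel column best matches the current residual, then refit. The gradient-based construction step itself is not reproduced, and all names are hypothetical.

```python
# Forward greedy basis-vector SELECTION baseline for sparse kernel
# (GP-style) regression; the abstract's method CONSTRUCTS a basis vector
# by gradient steps instead, which is not shown here.
import numpy as np

def greedy_sparse_kernel_regression(X, y, kernel, n_basis=10, lam=1e-3):
    n = len(X)
    K = kernel(X, X)                          # full kernel matrix (fine for small n)
    selected, residual = [], y.copy()
    for _ in range(n_basis):
        # Pick the column most correlated with the current residual.
        scores = np.abs(K.T @ residual)
        scores[selected] = -np.inf            # never re-pick a basis vector
        selected.append(int(np.argmax(scores)))
        Ks = K[:, selected]
        # Refit ridge coefficients on the selected columns.
        alpha = np.linalg.solve(Ks.T @ Ks + lam * np.eye(len(selected)),
                                Ks.T @ y)
        residual = y - Ks @ alpha
    return selected, alpha
```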
Boosting and Bagging of Neural Networks with Applications to Financial Time Series
Boosting and bagging are two techniques for improving the performance of learning algorithms. Both techniques have been successfully used in machine learning to improve the performance of classification algorithms such as decision trees and neural networks. In this paper, we focus on the use of feedforward backpropagation neural networks for time series classification problems. We apply boosting ...
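As a hedged illustration of the setup this abstract describes, the sketch below runs AdaBoost-style rounds over small feedforward networks for binary classification. It trains each round on a weighted resample because scikit-learn's `MLPClassifier` does not accept sample weights; the paper's actual architectures and data are not reproduced, and the helper names are hypothetical.

```python
# AdaBoost over small feedforward networks, binary labels y in {-1, +1}.
# Each round trains on a weighted resample of the data (MLPClassifier has
# no sample_weight support). A sketch, not the paper's exact procedure.
import numpy as np
from sklearn.neural_network import MLPClassifier

def boost_mlps(X, y, n_rounds=10, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    w = np.full(n, 1.0 / n)                      # example weights
    models, alphas = [], []
    for _ in range(n_rounds):
        idx = rng.choice(n, size=n, replace=True, p=w)   # weighted resample
        clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=500,
                            random_state=seed).fit(X[idx], y[idx])
        pred = clf.predict(X)
        err = w[pred != y].sum()
        if err == 0 or err >= 0.5:               # perfect or no better than chance
            if err == 0:
                models.append(clf); alphas.append(1.0)
            break
        alpha = 0.5 * np.log((1 - err) / err)
        w *= np.exp(-alpha * y * pred)           # up-weight the mistakes
        w /= w.sum()
        models.append(clf); alphas.append(alpha)
    return models, alphas

def boosted_predict(models, alphas, X):
    # Weighted vote of the base networks.
    votes = sum(a * m.predict(X) for a, m in zip(alphas, models))
    return np.sign(votes)
```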
Do we need hundreds of classifiers to solve real world classification problems?
We evaluate 179 classifiers arising from 17 families (discriminant analysis, Bayesian, neural networks, support vector machines, decision trees, rule-based classifiers, boosting, bagging, stacking, random forests and other ensembles, generalized linear models, nearest-neighbors, partial least squares and principal component regression, logistic and multinomial regression, multiple adaptive regre...
Margin Maximizing Loss Functions
Margin maximizing properties play an important role in the analysis of classification models, such as boosting and support vector machines. Margin maximization is theoretically interesting because it facilitates generalization error analysis, and practically interesting because it presents a clear geometric interpretation of the models being built. We formulate and prove a sufficient condition fo...
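For reference, the standard normalized-margin definition that this line of analysis builds on is shown below; the paper's own sufficient condition is not reproduced from the truncated abstract.

```latex
% Standard definitions only; not the paper's sufficient condition.
% Margin of example (x_i, y_i) under a boosted fit f(x) = \sum_t \alpha_t h_t(x):
\[
  m_i \;=\; \frac{y_i \sum_t \alpha_t h_t(x_i)}{\sum_t |\alpha_t|}.
\]
% Informally, a loss is margin maximizing when, as regularization vanishes,
% the normalized minimizer of \sum_i L\bigl(y_i f(x_i)\bigr) approaches the
% maximum-margin solution; the SVM hinge loss L(z) = (1 - z)_+ is the
% canonical example.
```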